127 research outputs found

    Multiple genome alignment for identifying the core structure among moderately related microbial genomes

    BACKGROUND: Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. RESULTS: The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. CONCLUSION: The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.

    Domain Movement within a Gene: A Novel Evolutionary Mechanism for Protein Diversification

    A protein function is carried out by a specific domain localized at a specific position. In the present study, we report that, within a gene, a specific amino acid sequence can move between a certain position and another position. This was discovered when the sequences of restriction-modification systems within the bacterial species Helicobacter pylori were compared. In the specificity subunit of Type I restriction-modification systems, DNA sequence recognition is mediated by target recognition domain 1 (TRD1) and TRD2. To our surprise, several sequences are shared by TRD1 and TRD2 of genes (alleles) at the same locus (chromosomal location); these domains appear to have moved between the two positions. The gene/protein organization can be represented as x-(TRD1)-y-x-(TRD2)-y, where x and y represent repeat sequences. Movement probably occurs by recombination at these flanking DNA repeats. In accordance with this hypothesis, recombination at these repeats also appears to decrease two TRDs into one TRD or increase these two TRDs to three TRDs (TRD1-TRD2-TRD2) and to allow TRD movement between genes even at different loci. Similar movement of domains between TRD1 and TRD2 was observed for the specificity subunit of a Type IIG restriction enzyme. Similar movement of domains between TRD1 and TRD2 was observed for Type I restriction-modification enzyme specificity genes in two more eubacterial species, Streptococcus pyogenes and Mycoplasma agalactiae. Lateral domain movements within a protein, which we have designated DOMO (domain movement), represent novel routes for the diversification of proteins.

    Discovery of a novel restriction endonuclease by genome comparison and application of a wheat-germ-based cell-free translation assay: PabI (5′-GTA/C) from the hyperthermophilic archaeon Pyrococcus abyssi

    To search for restriction endonucleases, we used a novel plant-based cell-free translation procedure that bypasses the toxicity of these enzymes. To identify candidate genes, the related genomes of the hyperthermophilic archaea Pyrococcus abyssi and Pyrococcus horikoshii were compared. In line with the selfish mobile gene hypothesis for restriction–modification systems, apparent genome rearrangement around putative restriction genes served as a selecting criterion. Several candidate restriction genes were identified and then amplified in such a way that they were removed from their own translation signal. During their cloning into a plasmid, the genes became connected with a plant translation signal. After in vitro transcription by T7 RNA polymerase, the mRNAs were separated from the template DNA and translated in a wheat-germ-based cell-free protein synthesis system. The resulting solution could be directly assayed for restriction activity. We identified two deoxyribonucleases. The novel enzyme was denoted as PabI, purified and found to recognize 5′-GTAC and leave a 3′-TA overhang (5′-GTA/C), a novel restriction enzyme-generated terminus. PabI is active up to 90°C and optimally active at a pH of around 6 and in NaCl concentrations ranging from 100 to 200 mM. We predict that it has a novel 3D structure
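    The cut-site notation 5′-GTA/C in the abstract above can be illustrated with a small in-silico digest. This is only a sketch under stated assumptions: it treats the input as a linear, top-strand DNA string, places the cut after the third base of each GTAC site, and uses hypothetical helper names (`find_cut_positions`, `digest`) that do not come from the paper or from any existing library.

```python
import re

# Recognition sequence and cut offset taken from the abstract's "5'-GTA/C" notation.
SITE = "GTAC"
CUT_OFFSET = 3  # the top-strand cut falls after the third base (GTA/C)


def find_cut_positions(seq):
    """Return top-strand cut positions (0-based string indices) for each GTAC site."""
    seq = seq.upper()
    return [m.start() + CUT_OFFSET for m in re.finditer(SITE, seq)]


def digest(seq):
    """Split a linear sequence (assumption: not circular) at every cut position."""
    seq = seq.upper()
    bounds = [0] + find_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAGTACTTGTACGG"  # made-up sequence for illustration only
    print(find_cut_positions(example))
    print(digest(example))
```

    Because GTAC is its own reverse complement, the bottom strand is cut at the mirrored position within the same site, which is what produces the two-base 3′-TA overhang described in the abstract; the sketch models the top strand only.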

    Evolution in an oncogenic bacterial species with extreme genome plasticity: Helicobacter pylori East Asian genomes

    BACKGROUND: The genome of Helicobacter pylori, an oncogenic bacterium in the human stomach, rapidly evolves and shows wide geographical divergence. The high incidence of stomach cancer in East Asia might be related to bacterial genotype. We used newly developed comparative methods to follow the evolution of East Asian H. pylori genomes using 20 complete genome sequences from Japanese, Korean, Amerind, European, and West African strains. RESULTS: A phylogenetic tree of concatenated well-defined core genes supported divergence of the East Asian lineage (hspEAsia; Japanese and Korean) from the European lineage ancestor, and then from the Amerind lineage ancestor. Phylogenetic profiling revealed a large difference in the repertoire of outer membrane proteins (including oipA, hopMN, babABC, sabAB and vacA-2) through gene loss, gain, and mutation. All known functions associated with molybdenum, a rare element essential to nearly all organisms that catalyzes two-electron-transfer oxidation-reduction reactions, appeared to be inactivated. Two pathways linking acetyl~CoA and acetate appeared intact in some Japanese strains. Phylogenetic analysis revealed greater divergence between the East Asian (hspEAsia) and the European (hpEurope) genomes in proteins in host interaction, specifically virulence factors (tipα), outer membrane proteins, and lipopolysaccharide synthesis (human Lewis antigen mimicry) enzymes. Divergence was also seen in proteins in electron transfer and translation fidelity (miaA, tilS), a DNA recombinase/exonuclease that recognizes genome identity (addA), and DNA/RNA hybrid nucleases (rnhAB). Positively selected amino acid changes between hspEAsia and hpEurope were mapped to products of cagA, vacA, homC (outer membrane protein), sotB (sugar transport), and a translation fidelity factor (miaA). Large divergence was seen in genes related to antibiotics: frxA (metronidazole resistance), def (peptide deformylase, drug target), and ftsA (actin-like, drug target). CONCLUSIONS: These results demonstrate dramatic genome evolution within a species, especially in likely host interaction genes. The East Asian strains appear to differ greatly from the European strains in electron transfer and redox reactions. These findings also suggest a model of adaptive evolution through proteome diversification and selection through modulation of translational fidelity. The results define H. pylori East Asian lineages and provide essential information for understanding their pathogenesis and designing drugs and therapies that target them.

    A deeply branching thermophilic bacterium with an ancient acetyl-CoA pathway dominates a subsurface ecosystem

    A nearly complete genome sequence of Candidatus ‘Acetothermum autotrophicum’, a presently uncultivated bacterium in candidate division OP1, was revealed by metagenomic analysis of a subsurface thermophilic microbial mat community. Phylogenetic analysis based on the concatenated sequences of proteins common among 367 prokaryotes suggests that Ca. ‘A. autotrophicum’ is one of the earliest diverging bacterial lineages. It possesses a folate-dependent Wood-Ljungdahl (acetyl-CoA) pathway of CO₂ fixation, is predicted to have an acetogenic lifestyle, and possesses the newly discovered archaeal-autotrophic type of bifunctional fructose 1,6-bisphosphate aldolase/phosphatase. A phylogenetic analysis of the core gene cluster of the acetyl-CoA pathway, shared by acetogens, methanogens, some sulfur- and iron-reducers and dechlorinators, supports the hypothesis that the core gene cluster of Ca. ‘A. autotrophicum’ is a particularly ancient bacterial pathway. The habitat, physiology and phylogenetic position of Ca. ‘A. autotrophicum’ support the view that the first bacterial and archaeal lineages were H₂-dependent acetogens and methanogens living in hydrothermal environments.

    CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes

    BACKGROUND: The recent accumulation of closely related genomic sequences provides a valuable resource for the elucidation of the evolutionary histories of various organisms. However, although numerous alignment calculation and visualization tools have been developed to date, the analysis of complex genomic changes, such as large insertions, deletions, inversions, translocations and duplications, still presents certain difficulties. RESULTS: We have developed a comparative genome analysis tool, named CGAT, which allows detailed comparisons of closely related bacteria-sized genomes mainly through visualizing middle-to-large-scale changes to infer underlying mechanisms. CGAT displays precomputed pairwise genome alignments on both dotplot and alignment viewers with scrolling and zooming functions, and allows users to move along the pre-identified orthologous alignments. Users can place several types of information on this alignment, such as the presence of tandem repeats or interspersed repetitive sequences and changes in G+C contents or codon usage bias, thereby facilitating the interpretation of the observed genomic changes. In addition to displaying precomputed alignments, the viewer can dynamically calculate the alignments between specified regions; this feature is especially useful for examining the alignment boundaries, as these boundaries are often obscure and can vary between programs. Besides the alignment browser functionalities, CGAT also contains an alignment data construction module, which contains various procedures that are commonly used for pre- and post-processing for large-scale alignment calculation, such as the split-and-merge protocol for calculating long alignments, chaining adjacent alignments, and ortholog identification. Indeed, CGAT provides a general framework for the calculation of genome-scale alignments using various existing programs as alignment engines, which allows users to compare the outputs of different alignment programs. Earlier versions of this program have been used successfully in our research to infer the evolutionary history of apparently complex genome changes between closely related eubacteria and archaea. CONCLUSION: CGAT is a practical tool for analyzing complex genomic changes between closely related genomes using existing alignment programs and other sequence analysis tools combined with extensive manual inspection.

    Toward community standards in the quest for orthologs

    The identification of orthologs (gene pairs descended from a common ancestor through speciation, rather than duplication) has emerged as an essential component of many bioinformatics applications, ranging from the annotation of new genomes to experimental target prioritization. Yet, the development and application of orthology inference methods is hampered by the lack of consensus on source proteomes, file formats and benchmarks. The second ‘Quest for Orthologs’ meeting brought together stakeholders from various communities to address these challenges. We report on achievements and outcomes of this meeting, focusing on topics of particular relevance to the research community at large. The Quest for Orthologs consortium is an open community that welcomes contributions from all researchers interested in orthology research and applications. Contact: [email protected]